Abstract

This work considers the infinite-time discounted optimal control problem for continuous-time input-affine polynomial dynamical systems subject to polynomial state and box input constraints. We propose a sequence of sum-of-squares (SOS) approximations of this problem obtained by first lifting the original problem into the space of measures with continuous densities and then restricting these densities to polynomials. These approximations are tightenings, rather than relaxations, of the original problem and provide a sequence of rational controllers with value functions associated to these controllers converging (under some technical assumptions) to the value function of the original problem. In addition, we describe a method to obtain polynomial approximations from above and from below to the value function of the extracted rational controllers, and a method to obtain approximations from below to the optimal value function of the original problem, thereby obtaining a sequence of asymptotically optimal rational controllers with explicit estimates of suboptimality. Numerical examples demonstrate the approach.

Keywords: Optimal control, nonlinear control, sum-of-squares, semidefinite programming, 
occupation measures, value function approximation 

1 Introduction 

This paper considers the infinite-time discounted optimal control problem for continuous-time input-affine polynomial dynamical systems subject to polynomial state constraints and box input constraints. This problem has a long history in both control and economics literature. Various methods to tackle this problem have been developed, often based on the analysis of the associated Hamilton-Jacobi-Bellman equation.

In this work we take a different approach: We first lift the problem into an infinite-dimensional space of measures with continuous densities where this problem becomes convex; in fact a linear program (LP). This lifting is a tightening, i.e., its optimal value is greater than or equal to the optimal value of the original problem, and under suitable technical conditions the two optimal values coincide. This infinite-dimensional LP is then further tightened by restricting the class of functions to polynomials of a prescribed degree and replacing nonnegativity constraints by sufficient sum-of-squares (SOS) constraints. This leads to a hierarchy of semidefinite programming (SDP) tightenings of the original problem indexed by the degree of the polynomials. The solutions to the SDPs yield immediately a sequence of rational controllers, and we prove that, under suitable technical assumptions, the value functions associated to these controllers converge from above to the value function of the original problem.

We also describe how to obtain a sequence of polynomial approximations converging from 
above and from below to the value function associated to each rational controller. Combined 
with existing techniques to obtain polynomial under approximations of the value function of 
the original problem (adapted to our setting), this method can be viewed as a design tool 
providing a sequence of rational controllers asymptotically optimal in the original problem 
with explicit estimates of suboptimality in each step. 

The idea of lifting a nonlinear problem to an infinite-dimensional space dates back at least 
to the work of L. C. Young [25] and subsequent works of Warga [26], Vinter and Lewis [2^],
Rubio [22] and many others, both in deterministic and stochastic settings. These works
typically lift the original problem into the space of measures and this lifting is a relaxation 
(i.e., its optimal value is less than or equal to the optimal value of the original problem) and 
under suitable conditions the two values coincide. 

More recently, this infinite-dimensional lifting was utilized numerically by relaxing the infinite-dimensional LP into a finite-dimensional SDP [13] or finite-dimensional LP [1]. Whereas the LP relaxations are obtained by classical state- and control-space gridding, the SDP relaxations are obtained by optimizing over truncated moment sequences (i.e., involving only finitely many moments) of the measures and imposing conditions necessary for these truncated moment sequences to be feasible in the infinite-dimensional lifted problem. These
finite-dimensional relaxations provide lower bounds on the value function of the optimal 
control problem and seem to be difficult to use for control design with strong convergence 
guarantees; a controller extraction from the relaxations is possible although no convergence 
(e.g., 0 B) or only very weak convergence can be established (e.g., m na in the related 
context of region of attraction approximation). 

Contrary to these works, in this approach we tighten the infinite-dimensional LP by optimizing over polynomial densities of the measures and imposing conditions sufficient for these densities to be feasible in the infinite-dimensional lifted problem, thereby obtaining upper bounds as opposed to lower bounds. Crucially, to ensure that polynomial densities of arbitrarily low degrees exist for our problem (and therefore the resulting SDP tightenings are feasible), we work with free initial and final measures and set up the cost function and constraints such that this additional freedom does not affect optimality. Importantly, we do not assume that the state constraint set is control invariant, a requirement that is often imposed in the existing literature (e.g., [19]) but rarely met in practice.

The presented approach bears some similarity with the density approach of [21] for global stabilization, later extended to optimal control (in a purely theoretical setting) in [19] and recently generalized to optimal stabilization of a given invariant set in [20] (providing both theoretical results and a practical computation method). However, contrary to [21] we consider the problem of optimal control, not stabilization, and moreover we work under state constraints. Contrary to [20] we work in continuous time, consider a more general problem (optimal control, not optimal stabilization of a given set) and our approach of finite-dimensional approximation is completely different in the sense that it is based purely on convex optimization and it does not rely on state-space discretization. Moreover, and importantly, our approach comes with convergence guarantees.

Finally, let us mention that this work is inspired by [12], where a converging sequence of 
upper bounds on static polynomial optimization problems was proposed, as opposed to a 
converging sequence of lower bounds as originally developed in m- 


2 Preliminaries 

2.1 Notation 

We use $L(X;Y)$ to denote the space of all Lebesgue measurable functions defined on a set $X \subset \mathbb{R}^n$ and taking values in the set $Y \subset \mathbb{R}^m$. If the space $Y$ is not specified it is understood to be $\mathbb{R}$. The spaces of integrable functions and essentially bounded functions are denoted by $L^1(X;Y)$ and $L^\infty(X;Y)$, respectively. The spaces of continuous respectively $k$-times continuously differentiable functions are denoted by $C(X;Y)$ respectively $C^k(X;Y)$. By a (Borel) measure we understand a countably additive mapping from (Borel) sets to nonnegative real numbers. Integration of a continuous function $v$ with respect to a measure $\mu$ on a set $X$ is denoted by $\int_X v(x)\,d\mu(x)$ or also $\int v\,d\mu$ when the variable and domain of integration are clear from the context. A probability measure is a measure with unit mass (i.e., $\int 1\,d\mu = 1$). The support of a measure $\mu$, defined as the smallest closed set whose complement has zero measure, is denoted by $\operatorname{spt}\mu$. The ring of all multivariate polynomials in a variable $x$ is denoted by $\mathbb{R}[x]$, the vector space of all polynomials of degree no more than $d$ is denoted by $\mathbb{R}[x]_d$ and the vector space of $m$-dimensional polynomial vectors is denoted by $\mathbb{R}[x]^m$. The boundary of a set $X$ is denoted by $\partial X$, the interior by $X^\circ$ and the closure by $\bar X$. The Euclidean distance of a point $x$ from a set $X$ is denoted by $\operatorname{dist}_X(x)$. For a possibly matrix-valued function $f \in C(X;\mathbb{R}^{n\times m})$ we define $\|f\|_{C^0(X)} := \sup_{x\in X}\max_{i,j}|f_{ij}(x)|$ and for a vector-valued function $g \in C^1(X;\mathbb{R}^n)$ we define $\|g\|_{C^1(X)} := \|g\|_{C^0(X)} + \|\nabla g\|_{C^0(X)}$, where $\nabla g$ denotes the Jacobian of $g$. If clear from the context we write $\|\cdot\|_{C^0}$ for $\|\cdot\|_{C^0(X)}$ and similarly for the $C^1$ norm.

2.2 SOS programming


Crucial to the material presented in the paper is the ability to decide whether a polynomial $p \in \mathbb{R}[x]$ is nonnegative on a set

$$X = \{x \in \mathbb{R}^n \mid g_i(x) \ge 0,\ i = 1,\ldots,n_g\},$$

with $g_i \in \mathbb{R}[x]$. A sufficient condition for $p$ to be nonnegative on $X$ is that it belongs to the truncated quadratic module of degree $d$ associated to $X$,

$$Q_d(X) := \Bigl\{ s_0 + \sum_{i=1}^{n_g} g_i(x)\,s_i(x) \;\Big|\; s_0 \in \Sigma_{2\lfloor d/2\rfloor},\ s_i \in \Sigma_{2\lfloor (d-\deg g_i)/2\rfloor} \Bigr\},$$

where $\Sigma_{2k}$ is the set of all polynomial sums of squares (SOS) of degree at most $2k$. Note in particular that $Q_{d+1}(X) \supset Q_d(X)$. If $p \in Q_d(X)$ for some $d \ge 0$ then clearly $p$ is nonnegative on $X$, and the following fundamental result shows that a certain converse result holds.

Proposition 1 (Putinar [16]) Let $N - \|x\|_2^2 \in Q_d(X)$ for some $d \ge 0$ and $N \ge 0$ and let $p \in \mathbb{R}[x]$ be strictly positive on $X$. Then $p \in Q_d(X)$ for some $d \ge 0$.

Combining with the Stone-Weierstrass Theorem, as an immediate corollary we get:

Corollary 1 Let $f \in C(X)$ be nonnegative on $X$ and let $N - \|x\|_2^2 \in Q_d(X)$ for some $d \ge 0$ and $N \ge 0$. Then for every $\epsilon > 0$ there exist $d \ge 0$ and $p_d \in Q_d(X)$ such that $\|f - p_d\|_{C^0} < \epsilon$.

Corollary 1 says that polynomials in $Q_d(X)$ are dense (with respect to the $C^0$ norm) in the space of continuous functions nonnegative on $X$ when we let $d$ tend to infinity.

In the rest of the text we use standard algebraic operations on sets. For instance if we write that $p \in gQ_d(X) + h\mathbb{R}[x]_d$, then it means that $p = gq + hr$ with $q \in Q_d(X)$ and $r \in \mathbb{R}[x]_d$.

The inclusion of $p \in Q_d(X)$ for a fixed $d$ is equivalent to the existence of a positive semidefinite matrix $W$ such that $p(x) = b(x)^\top W b(x)$, where $b(x)$ is a basis of $\mathbb{R}[x]_{d/2}$, the vector space of polynomials of degree at most $d/2$. Comparing coefficients leads to a set of affine constraints on the coefficients of $p$ and the entries of $W$. Deciding whether $p \in Q_d(X)$ therefore translates to the feasibility of a semidefinite programming problem with the coefficients of $p$ entering affinely. As a result, optimization of a linear function of the coefficients of $p$ subject to the constraint $p \in Q_d(X)$ translates to a semidefinite programming problem (SDP) and hence to a well-understood and widely studied class of convex optimization problems for which powerful algorithms and off-the-shelf software are available. See, e.g., [10] and the references therein for more details.
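
To make the Gram-matrix representation concrete, the following minimal sketch treats the simplest case of a univariate SOS polynomial with no constraint polynomials $g_i$. It is an illustration only, not the solver pipeline used in the paper: the helper names `gram_to_coeffs`, `det` and `is_psd` are hypothetical, the example polynomial $p(x) = (x^2-1)^2$ is an assumption, and the matrix $W$ is supplied by hand rather than found by an SDP solver.

```python
from itertools import combinations

def gram_to_coeffs(W):
    """Coefficients c[k] of p(x) = b(x)^T W b(x) with b(x) = (1, x, ..., x^r)."""
    r = len(W) - 1
    coeffs = [0.0] * (2 * r + 1)
    for i in range(r + 1):
        for j in range(r + 1):
            coeffs[i + j] += W[i][j]  # x^i * x^j contributes to degree i + j
    return coeffs

def det(A):
    """Determinant by Laplace expansion (adequate for tiny matrices only)."""
    if len(A) == 1:
        return A[0][0]
    total = 0.0
    for j in range(len(A)):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        total += (-1) ** j * A[0][j] * det(minor)
    return total

def is_psd(W, tol=1e-12):
    """A symmetric matrix is PSD iff all of its principal minors are nonnegative."""
    n = len(W)
    for size in range(1, n + 1):
        for idx in combinations(range(n), size):
            sub = [[W[i][j] for j in idx] for i in idx]
            if det(sub) < -tol:
                return False
    return True

# Hand-picked Gram matrix for p(x) = x^4 - 2x^2 + 1 = (x^2 - 1)^2.
W = [[1.0, 0.0, -1.0],
     [0.0, 0.0, 0.0],
     [-1.0, 0.0, 1.0]]
coeffs = gram_to_coeffs(W)  # coefficients ordered from degree 0 to degree 4
```

Matching `coeffs` against the coefficients of $p$ gives the affine constraints mentioned above; an SDP solver would search for a positive semidefinite $W$ satisfying them instead of being handed one.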

3 Problem statement


We consider the continuous-time input-affine^1 controlled dynamical system

$$\dot x = f(x) + \sum_{i=1}^{m} f_{u_i}(x)\,u_i, \tag{1}$$

where $x \in \mathbb{R}^n$ is the state, $u \in \mathbb{R}^m$ is the control input, and the data are polynomial: $f \in \mathbb{R}[x]^n$, $f_{u_i} \in \mathbb{R}[x]^n$, $i = 1,\ldots,m$. The system is subject to semi-algebraic state and box^2 input constraints

$$x(t) \in X := \{x \in \mathbb{R}^n \mid g_i(x) \ge 0,\ i = 1,\ldots,n_g\}, \tag{2a}$$

$$u(t) \in U := [0,\bar u]^m, \tag{2b}$$

where $g_i \in \mathbb{R}[x]$ and $\bar u > 0$. The set $X$ is assumed compact and the polynomials defining $X$ are assumed to be such that

$$g(x) := \prod_{i=1}^{n_g} g_i(x) > 0 \quad \forall x \in X^\circ. \tag{3}$$

Since $X$ is assumed compact, we also assume, without loss of generality, that the inequalities defining the set $X$ contain the inequality $N - \|x\|_2^2 \ge 0$ for some $N \ge 0$.

The goal of the paper is to (approximately) solve the following optimal control problem (OCP):

$$\begin{array}{rl}
V(x_0) := \inf\limits_{u,\tau} & \int_0^{\tau(x_0)} e^{-\beta t}\bigl[l_x(x(t)) + \sum_{i=1}^m l_{u_i}(x(t))\,u_i(t)\bigr]\,dt + e^{-\beta\tau(x_0)}M \\
\text{s.t.} & x(t) = x_0 + \int_0^t \bigl[f(x(s)) + \sum_{i=1}^m f_{u_i}(x(s))\,u_i(s)\bigr]\,ds, \\
 & (x(t),u(t)) \in X \times U \quad \forall t \in [0,\tau(x_0)], \\
 & u \in L^\infty([0,\tau(x_0)];U),\quad \tau \in L(X;[0,\infty]),
\end{array} \tag{4}$$

where $\beta > 0$ is a given discount factor and $M$ is a constant chosen such that

$$M > \beta^{-1} \sup_{x\in X,\,u\in U} \{l(x,u)\}, \tag{5}$$

where the joint stage cost

$$l(x,u) := l_x(x) + \sum_{i=1}^m l_{u_i}(x)\,u_i \tag{6}$$

is, without loss of generality, assumed to be nonnegative on $X \times U$. The state and input stage cost functions $l_x$ and $l_{u_i}$, $i = 1,\ldots,m$, are assumed to be polynomial. The function $\tau$ in OCP (4) is referred to as a stopping function; the optimization is therefore both over the control input $u$ and over the final time $\tau(x_0)$, which can be finite or infinite and can depend on the initial condition $x_0$.

^1 Any dynamical system $\dot x = f(x,u)$ depending nonlinearly on $u$ can be transformed to the input-affine form by using state inflation

$$\begin{bmatrix}\dot x \\ \dot u\end{bmatrix} = \begin{bmatrix} f(x,u) \\ v \end{bmatrix},$$

where $u$ is now a part of the state and $v$ a new control input; constraints on $v$ then correspond to rate constraints on $u$. Similarly, cost functions depending nonlinearly on $u$ in problem (4) can be handled using state inflation in exactly the same fashion.

^2 Any box can be affinely transformed to $[0,\bar u]^m$.

The function $x \mapsto V(x)$ in (4) is called the value function. The reason for choosing the slightly non-standard objective function in (4) is that with this objective function the value function $V$ is bounded (by $M$) on $X$ and it coincides with the standard^3 discounted infinite-horizon value function for all initial conditions $x_0 \in X$ for which the trajectories can be kept within the state constraint set $X$ forever using admissible controls, i.e., for all $x_0$ in the maximum control invariant set associated to the dynamics (1) and the constraints (2). To see the first claim, set $\tau(x_0) = 0$ for all $x_0 \in X$. To see the second claim notice that with $M$ chosen as in (5), it is always beneficial to continue the time evolution whenever possible and therefore $\tau(x_0) = +\infty$ for all $x_0$ in the maximum controlled invariant set associated to (1) and (2).
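
Both claims can be checked with a short computation. The sketch below is a worked illustration under the stated assumptions of (5) and (6); the shorthand $\bar l := \sup_{X\times U} l$ is introduced here only for convenience.

```latex
% Assumes 0 \le l(x,u) \le \bar l on X \times U and M > \bar l / \beta, as in (5)-(6).
% Cost of any feasible trajectory stopped at time \tau:
\int_0^{\tau} e^{-\beta t}\, l(x(t),u(t))\,dt + e^{-\beta\tau} M
  \;\le\; \frac{\bar l}{\beta}\bigl(1 - e^{-\beta\tau}\bigr) + e^{-\beta\tau} M
  \;=\; M - \Bigl(M - \frac{\bar l}{\beta}\Bigr)\bigl(1 - e^{-\beta\tau}\bigr) \;\le\; M .

% Change in cost when a feasible trajectory is extended from \tau to \tau + \Delta:
\int_{\tau}^{\tau+\Delta} e^{-\beta t}\, l\,dt - \bigl(e^{-\beta\tau} - e^{-\beta(\tau+\Delta)}\bigr) M
  \;\le\; \bigl(e^{-\beta\tau} - e^{-\beta(\tau+\Delta)}\bigr)\Bigl(\frac{\bar l}{\beta} - M\Bigr) \;<\; 0 .
```

The first line shows that no feasible pair costs more than $M$; the second shows that extending a trajectory that remains feasible strictly decreases the cost, which is why stopping is optimal only when the state is about to leave $X$.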

Remark 1 A constant $M$ satisfying (5) can be found either by analytically evaluating the
supremum in (5) or by using the techniques of m to find an upper bound.

Given a Lipschitz continuous feedback controller $u \in C(X;U)$ and a stopping function $\tau \in L(X;[0,\infty])$, the ODE (1) has a unique solution and we let $V_{u,\tau} \in L(X;[0,\infty])$ denote the value function attained by $(u,\tau)$ in (4), i.e., setting $u(t) = u(x(t))$. By $V_u$ we denote the value function $V_{u,\tau^\star}$, where $\tau^\star \in L(X;[0,\infty])$ is the optimal stopping function associated to $u$. Note that, by the choice of $M$ in (5), the optimal stopping function $\tau^\star$ is equal to the first hitting time of the complement of the constraint set $X$, i.e.,

$$\tau^\star(x_0) = \inf\{t \ge 0 \mid x(t\,|\,x_0) \notin X\},$$

where $x(t\,|\,x_0)$ is the trajectory of (1) with $u(t) = u(x(t))$ starting from $x_0$. Notice also that $V_{u,\tau}(x) \ge V(x)$ for all $x \in X$ and that for any pair $(u,\tau)$ feasible in (4) we have $V_{u,\tau}(x) \le M$ for all $x \in X$.
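
The definition of $V_u$ can be illustrated numerically. The sketch below is a rough forward-Euler estimate under assumptions that are not from the paper: the function name `value_under_feedback` is hypothetical, the state is scalar, `f` and `l` denote the closed-loop vector field and stage cost (the feedback already substituted), and the toy data $\dot x = 1$, $l \equiv 1$, $X = [-1,1]$, $\beta = 1$, $M = 2$ are chosen so that the exit time $\tau^\star(x_0) = 1 - x_0$ and the value are known in closed form.

```python
import math

def value_under_feedback(x0, f, l, in_X, beta, M, dt=1e-3, t_max=50.0):
    """Euler estimate of V_u(x0): discounted running cost accumulated until the
    first exit from X (or until t_max), plus the terminal term exp(-beta*tau)*M."""
    x, t, cost = x0, 0.0, 0.0
    while t < t_max and in_X(x):
        cost += math.exp(-beta * t) * l(x) * dt  # left Riemann sum of the running cost
        x += f(x) * dt                           # explicit Euler step of the closed loop
        t += dt
    return cost + math.exp(-beta * t) * M

# Toy closed loop (assumed for illustration): x' = 1, l = 1, X = [-1, 1].
beta, M = 1.0, 2.0  # M > sup l / beta = 1, as required by (5)
v_num = value_under_feedback(0.0, lambda x: 1.0, lambda x: 1.0,
                             lambda x: -1.0 <= x <= 1.0, beta, M)

# Closed form with tau = 1: (1 - e^{-beta}) / beta + e^{-beta} * M.
v_exact = (1.0 - math.exp(-beta)) / beta + math.exp(-beta) * M
```

The two values agree up to the $O(\mathrm{d}t)$ discretization error of the Euler scheme; for trajectories that never leave $X$ the loop runs until `t_max`, where the discounted terminal term is negligible.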

Throughout the paper, we make the following technical assumption: 


Assumption 1 There exists a sequence of Lipschitz continuous feedback controllers $\{u^k \in C(X;U)\}_{k=1}^\infty$ and stopping functions $\{\tau^k \in L(X;[0,\infty])\}_{k=1}^\infty$ feasible in (4) such that

$$\lim_{k\to\infty} \int_X \bigl(V_{u^k,\tau^k}(x) - V(x)\bigr)\,dx = 0 \tag{7}$$

and such that for every $k \ge 0$ there exist a function $\rho^k \in C^1(X)$ and a scalar $\gamma^k > 0$ such that $\rho^k(x) = 0$ if $\operatorname{dist}_{\partial X}(x) < \gamma^k$ and

$$\int_X \int_0^{\tau^k(x_0)} e^{-\beta t}\, v(x^k(t\,|\,x_0))\,dt\,dx_0 = \int_X v(x)\rho^k(x)\,dx \quad \forall v \in C(X), \tag{8}$$

where $x^k(\cdot\,|\,x_0)$ denotes the solution to (1) controlled by $u^k$.

^3 By standard we mean a discounted optimal control problem with cost $\int_0^\infty e^{-\beta t}\bigl[l_x(x(t)) + \sum_{i=1}^m l_{u_i}(x(t))\,u_i(t)\bigr]\,dt$ and no stopping function.

Remark 2 Note that $V_{u^k,\tau^k} \ge V$ on $X$ by construction and therefore (7) is equivalent to the $L^1$ convergence of $V_{u^k,\tau^k}$ to $V$.

Assumption 1 says that the optimal control inputs and stopping functions for OCP (4) can be well approximated by Lipschitz continuous feedback controllers and measurable stopping functions such that the resulting densities of the discounted occupation measures are continuously differentiable and vanish near the boundary of $X$. Note that the existence of an optimal feedback controller, as well as whether it can be well approximated by Lipschitz controllers, are subtle issues. Similarly it is a subtle issue whether asymptotically optimal stopping functions can be found such that the associated densities $\rho^k$ in (8) are continuously differentiable and vanish near the boundary of $X$ (note, however, that the left-hand side of (8) can always be represented as $\int_X v(x)\,d\mu^k(x)$ for some nonnegative measure $\mu^k$). This problem is of rather technical nature and has been studied in the literature (e.g., [3, Section 1.4] or [IH]), where affirmative results have been established in related settings. We do not undertake a study of this problem here and rely on Assumption 1 which is, for ease of reading, not stated in its most general form. For example, the functions $\rho^k$ do not need to be $C^1$ but only weakly differentiable and the integration on the left-hand side of (8) can be weighted by a nonnegative function $\rho_0^k \in L^1(X)$ satisfying $\rho_0^k \ge 1$ on $X$ and $\rho_0^k \to 1$ in $L^1(X)$. In addition, we conjecture that it is enough to require $\rho^k = 0$ on $\partial X$ and not necessarily on some neighborhood of $\partial X$; this is in particular the case when $X$ is a box or a ball but we expect all the results of the paper to hold with a general semialgebraic set for which the defining functions satisfy (3).

The main result of this paper is a hierarchy of sum-of-squares (SOS) problems providing an explicit sequence of rational feedback controllers $u^k \in C^\infty(X;U)$ such that, under Assumption 1, (7) holds with $\tau^k = \tau^\star$, the optimal stopping function associated to $u^k$, i.e., a sequence of asymptotically optimal rational controllers in the sense of the $L^1$ convergence of the associated value functions (see Remark 2).


4 Converging hierarchy of solutions 


In this section we present an infinite-dimensional linear program (LP) in the space of continuous functions whose sum-of-squares (SOS) approximations provide a sequence of rational controllers satisfying (7). This infinite-dimensional LP is closely related (and in a weak sense equivalent) to OCP (4); the rationale behind the derivation of the LP and its relation to OCP (4) is detailed in Section 5.

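The infinite-dimensional LP reads

$$\begin{array}{rll}
\inf\limits_{\rho,\rho_0,\rho_T,\sigma} & \int_X l_x(x)\rho(x)\,dx + \sum_{i=1}^m \int_X l_{u_i}(x)\sigma_i(x)\,dx + M\int_X \rho_T(x)\,dx & \\
\text{s.t.} & \rho_T - \rho_0 + \beta\rho + \operatorname{div}(\rho f) + \sum_{i=1}^m \operatorname{div}(\sigma_i f_{u_i}) = 0 & \\
 & \rho \le 0 & \text{on } \partial X \\
 & \rho_0 \ge 1 & \text{on } X \\
 & \bar u\rho \ge \sigma_i & \text{on } X,\ i = 1,\ldots,m \\
 & \rho_T \ge 0 & \text{on } X \\
 & \sigma_i \ge 0 & \text{on } X,\ i = 1,\ldots,m.
\end{array} \tag{9}$$

The optimization in (9) is over functions $(\rho,\rho_0,\rho_T,\sigma) \in C^1(X) \times C(X) \times C(X) \times C^1(X)^m$ with $\sigma = (\sigma_1,\ldots,\sigma_m)$.

The optimal value of (9) will be denoted by $p^\star$. The value attained in (9) by any tuple of densities $(\rho,\rho_0,\rho_T,\sigma)$ feasible in (9) will be denoted by $p(\rho,\rho_0,\rho_T,\sigma)$.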

Remark 3 (Non-uniform weighting) Note that we could have imposed $\rho_0 \ge \bar\rho_0$ for any polynomial $\bar\rho_0$ nonnegative on $X$. Choosing a different $\bar\rho_0$ has no impact on the asymptotic convergence of the value functions established in the rest of the paper as long as $\bar\rho_0$ is strictly positive on $X$. It may, however, influence the speed of convergence in different subsets of $X$. In general we expect faster convergence where $\bar\rho_0$ is large and slower convergence where it is small. Choosing a non-constant $\bar\rho_0$ therefore allows one to assign a different importance to different subsets of $X$.


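The infinite-dimensional LP (9) is then approximated by a hierarchy of sum-of-squares (SOS) problems, which immediately translate to finite-dimensional semidefinite programs (SDPs).

The SOS approximation of degree $d$ of (9) reads

$$\inf_{\rho,\rho_0,\rho_T,\sigma}\ \int_X l_x(x)\rho(x)\,dx + \sum_{i=1}^m \int_X l_{u_i}(x)\sigma_i(x)\,dx + M\int_X \rho_T(x)\,dx$$

$$\text{s.t.}\quad \rho_T - \rho_0 + \beta\rho + \operatorname{div}(\rho f) + \sum_{i=1}^m \operatorname{div}(\sigma_i f_{u_i}) = 0$$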


P ^ “1“ deg^j deg5 

Po “ 1 ^ Qd{X) 

i = l,.. 

.,ng 

^P ^ pQd—deggi^^^ 

PT ^ Qd{X) 

i = l,.. 

., m 

(Ji G pQd—deg•! 

i = l,.. 

., m. 


Once a basis for $\mathbb{R}[x]_d$ is fixed (e.g., the standard monomial basis), the objective becomes linear in the coefficients of polynomials $\rho$, $\sigma$ and $\rho_T$, and the equality constraint is imposed by equating the coefficients. The inclusions in the quadratic modules translate to semidefinite constraints and affine equality constraints; see Section 2.2. Optimization problem (10) therefore immediately translates to an SDP.


Remark 4 (Feasibility) Trivially, any feasible solution to (10) is feasible in (9). Also, problem (10) is feasible for any $d \ge 0$. Indeed $(\rho,\rho_0,\rho_T,\sigma) = (0,1,1,0)$ is always feasible in (10). See also Remark 6 below.
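
The statement that the equality constraint is imposed by equating coefficients can be sketched for a scalar state and a single input. The code below is an illustration under assumptions not taken from the paper: polynomials are stored as coefficient lists in increasing degree, the helper names (`add`, `scale`, `mul`, `diff`, `residual`) are hypothetical, and the data $f(x) = -x$, $f_u(x) = 1$, $\beta = 1$ are chosen only as an example.

```python
def trim(p):
    """Drop trailing zero coefficients (keep at least one entry)."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def add(p, q):
    n = max(len(p), len(q))
    return trim([(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
                 for i in range(n)])

def scale(c, p):
    return trim([c * a for a in p])

def mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def diff(p):
    """Derivative of a univariate polynomial (the divergence in one dimension)."""
    return trim([k * p[k] for k in range(1, len(p))] or [0])

def residual(rho, rho0, rhoT, sigma, f, fu, beta):
    """Coefficients of rhoT - rho0 + beta*rho + (rho*f)' + (sigma*fu)'."""
    total = [0]
    for term in (rhoT, scale(-1, rho0), scale(beta, rho),
                 diff(mul(rho, f)), diff(mul(sigma, fu))):
        total = add(total, term)
    return total

f, fu, beta = [0, -1], [1], 1            # assumed example: f(x) = -x, f_u(x) = 1

# The trivial point of Remark 4 satisfies the equality constraint identically.
res_trivial = residual([0], [1], [1], [0], f, fu, beta)

# For rho = 1 - x^2, sigma = 0, rho0 = 1, the equality constraint fixes rho_T.
rhoT_needed = scale(-1, residual([1, 0, -1], [1], [0], [0], f, fu, beta))
```

In the second case the equality constraint forces $\rho_T = 1 - 2x^2$, which is negative near $x = \pm 1$, so on $X = [-1,1]$ this particular $\rho$ is not feasible together with $\rho_0 = 1$; the inclusion constraints of (10) rule such candidates out.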


If non-uniform weighting of initial conditions (see Remark 3) was required, the constraint $\rho_0 - 1 \in Q_d(X)$ would be replaced by $\rho_0 - \bar\rho_0 \in Q_d(X)$ for a polynomial weighting function $\bar\rho_0$ nonnegative on $X$.


Given an optimal solution $(\rho^d, \rho_0^d, \rho_T^d, \sigma^d)$ to (10), we define a rational control law by

$$u_i^d(x) := \frac{\sigma_i^d(x)}{\rho^d(x)} \quad \forall x \in X,\ i = 1,\ldots,m. \tag{11}$$


The main result of the paper is the following theorem stating that the controllers $u^d$ are asymptotically optimal:

Theorem 1 For all $d \ge 0$ we have $u^d(x) \in U$ for all $x \in X$ and if Assumption 1 holds, then

$$\lim_{d\to\infty} \int_X \bigl(V_{u^d}(x) - V(x)\bigr)\,dx = 0, \tag{12}$$

that is, $V_{u^d} \to V$ in $L^1(X)$ (note that $V_{u^d} \ge V$ on $X$).


5 Rationale behind the LP formulation (9) and proof of the main Theorem 1


This section explains the rationale behind the LP problem (9) and its relation to the OCP (4) and gives the proof of Theorem 1. First, we lift the original problem (4) into the space of measures with nonnegative densities in $C(X)$; this lifting is problem (9). Next we tighten the problem by considering only polynomials of prescribed degree and with nonnegativity constraints enforced via SOS conditions; this is problem (10). Importantly, the lifting (9) is a tightening of the original problem (4) as shown in Theorem 3 below. This is in contrast with [13] where the original problem was lifted into the space of measures and this lifting was a relaxation.

To be more concrete, observe that any initial measure $\mu_0$, stopping function $\tau \in L(X;[0,\infty])$, and family of trajectories $\{x(\cdot\,|\,x_0)\}_{x_0\in X}$ of (1) generated by a Lipschitz controller $u \in C(X;U)$ give rise to a triplet of measures defined by

$$\int_X v(x)\,d\mu(x) = \int_X \int_0^{\tau(x_0)} e^{-\beta t}\, v(x(t\,|\,x_0))\,dt\,d\mu_0(x_0), \tag{13a}$$

$$\int_X v(x)\,d\mu_T(x) = \int_X e^{-\beta\tau(x_0)}\, v(x(\tau(x_0)\,|\,x_0))\,d\mu_0(x_0), \tag{13b}$$

$$\int_X v(x)\,d\nu_i(x) = \int_X \int_0^{\tau(x_0)} e^{-\beta t}\, v(x(t\,|\,x_0))\,u_i(x(t\,|\,x_0))\,dt\,d\mu_0(x_0). \tag{13c}$$

The measure $\mu$ is called discounted occupation measure, the measure $\mu_T$ terminal measure and the measures $\nu_i$, $i = 1,\ldots,m$, control measures. These measures satisfy the discounted Liouville equation

$$\int_X v\,d\mu_T(x) = \int_X v\,d\mu_0(x) + \int_X (\nabla v \cdot f - \beta v)\,d\mu(x) + \sum_{i=1}^m \int_X \nabla v \cdot f_{u_i}\,d\nu_i(x) \tag{14}$$

for all $v \in C^1(X)$. This follows by direct computation; see, e.g., [8]. Notice also that $d\nu_i(x) = u_i(x)\,d\mu(x)$, i.e., $\nu_i$ is absolutely continuous with respect to $\mu$ with Radon-Nikodym derivative equal to $u_i$.
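
The direct computation behind the discounted Liouville equation can be sketched as follows. This is a worked outline under the definitions (13a)-(13c), with $v \in C^1(X)$ and trajectories of (1) under the feedback $u$; it is not a replacement for the argument in the cited reference.

```latex
% Chain rule along a trajectory x(t | x_0) of (1) with u(t) = u(x(t)):
\frac{d}{dt}\Bigl[e^{-\beta t}\, v(x(t\,|\,x_0))\Bigr]
  = e^{-\beta t}\Bigl[\nabla v \cdot f + \sum_{i=1}^m \nabla v \cdot f_{u_i}\, u_i - \beta v\Bigr](x(t\,|\,x_0)).

% Integrate over t in [0, \tau(x_0)]:
e^{-\beta\tau(x_0)}\, v(x(\tau(x_0)\,|\,x_0)) - v(x_0)
  = \int_0^{\tau(x_0)} e^{-\beta t}\Bigl[\nabla v \cdot f - \beta v
      + \sum_{i=1}^m \nabla v \cdot f_{u_i}\, u_i\Bigr](x(t\,|\,x_0))\,dt.

% Integrate over x_0 with respect to \mu_0 and use (13a)-(13c):
\int_X v\,d\mu_T - \int_X v\,d\mu_0
  = \int_X (\nabla v \cdot f - \beta v)\,d\mu + \sum_{i=1}^m \int_X \nabla v \cdot f_{u_i}\,d\nu_i .
```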

Crucially, the converse statement is also true, although we have to go from stopping functions 
to stopping measures: 


Theorem 2 (Superposition) If measures $\mu$, $\mu_0$, $\mu_T$ and $\nu_i$, $i = 1,\ldots,m$, satisfy (14) with $\operatorname{spt}\mu_0 \subset X$, $\operatorname{spt}\mu \subset X$ and $\operatorname{spt}\mu_T \subset X$ and $d\nu_i = u_i\,d\mu$ for some Lipschitz $u \in C(X;U)$, then there exists an ensemble of probability measures (i.e., measures with unit mass) $\{\tau_{x_0}\}_{x_0\in X}$ and an ensemble of trajectories $\{x(\cdot\,|\,x_0)\}_{x_0\in X}$ of the system (1) controlled with $u(t) = u(x(t))$ such that $x(t\,|\,x_0) \in X$ for all $t \in \operatorname{spt}\tau_{x_0}$ and

$$\int_X v(x)\,d\mu_0(x) = \int_X v(x(0\,|\,x_0))\,d\mu_0(x_0), \tag{15a}$$

$$\int_X v(x)\,d\mu(x) = \int_X \int_0^\infty \int_0^{\tau} e^{-\beta t}\, v(x(t\,|\,x_0))\,dt\,d\tau_{x_0}(\tau)\,d\mu_0(x_0), \tag{15b}$$

$$\int_X v(x)\,d\mu_T(x) = \int_X \int_0^\infty e^{-\beta\tau}\, v(x(\tau\,|\,x_0))\,d\tau_{x_0}(\tau)\,d\mu_0(x_0), \tag{15c}$$

$$\int_X v(x)\,d\nu_i(x) = \int_X \int_0^\infty \int_0^{\tau} e^{-\beta t}\, v(x(t\,|\,x_0))\,u_i(x(t\,|\,x_0))\,dt\,d\tau_{x_0}(\tau)\,d\mu_0(x_0) \tag{15d}$$

for all $v \in C^1(X)$.

Proof: See Appendix B. □


Remark 5 (Interpretation of Theorem 2) Theorem 2 says that any measures satisfying (14) are generated by a superposition of the trajectories of the dynamical system $\dot x = f(x) + \sum_{i=1}^m f_{u_i}(x)u_i(x)$, where the superposition is over the final time of the trajectories. Note that there is a unique trajectory corresponding to each initial condition (since the vector field $f(x) + \sum_{i=1}^m f_{u_i}(x)u_i(x)$ is Lipschitz) but this unique trajectory can be stopped at multiple times (in fact at a whole continuum of times) allowing for superposition; this superposition is captured by the stopping measures $\{\tau_{x_0}\}_{x_0\in X}$. For example, if $\tau_{x_0}$ is a Dirac measure at a given time, then there is no superposition; if $\tau_{x_0}$ has a discrete distribution, then there is a superposition of finitely or countably many overlapping trajectories starting at $x_0$ stopped at different time instances; if $\tau_{x_0}$ has a continuous distribution then there is a superposition of a continuum of overlapping trajectories starting from $x_0$ stopped at different time instances.


If in addition the measures $\mu_0$, $\mu$, $\mu_T$ satisfying the discounted Liouville equation (14) are absolutely continuous with respect to the Lebesgue measure with densities $\rho_0 \in C(X)$, $\rho \in C^1(X)$, $\rho_T \in C(X)$ such that $\rho = 0$ on $\partial X$, then these densities satisfy

$$\rho_T - \rho_0 + \beta\rho + \operatorname{div}(f\rho) + \sum_{i=1}^m \operatorname{div}(f_{u_i}\sigma_i) = 0 \tag{16}$$

with $\sigma_i = u_i\rho$, $i = 1,\ldots,m$. This follows directly by substituting $d\mu_0 = \rho_0\,dx$, $d\mu = \rho\,dx$, $d\mu_T = \rho_T\,dx$ and $d\nu_i = u_i\,d\mu = u_i\rho\,dx = \sigma_i\,dx$ in (14) and using integration by parts. Equation (16) holds almost everywhere in $X$ with a Lipschitz controller $u$, since Lipschitz functions are differentiable almost everywhere and the integration by parts formula applies to them, and everywhere with $u \in C^1(X;U)$.
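
The integration-by-parts step can be written out as follows. This is a worked sketch under the stated assumptions ($\rho = 0$ on $\partial X$, hence also $\sigma_i = u_i\rho = 0$ on $\partial X$); the symbol $n$ for the outward unit normal and $dS$ for the surface element are introduced here only for the illustration.

```latex
% Divergence theorem; the boundary term vanishes because \rho = 0 on \partial X:
\int_X \nabla v \cdot f\,\rho\,dx
  = \int_{\partial X} v\,\rho\, f\cdot n\,dS - \int_X v\,\operatorname{div}(f\rho)\,dx
  = -\int_X v\,\operatorname{div}(f\rho)\,dx,
\qquad
\int_X \nabla v \cdot f_{u_i}\,\sigma_i\,dx = -\int_X v\,\operatorname{div}(f_{u_i}\sigma_i)\,dx .

% Substituting into (14) and collecting terms:
\int_X v\,\Bigl[\rho_T - \rho_0 + \beta\rho + \operatorname{div}(f\rho)
   + \sum_{i=1}^m \operatorname{div}(f_{u_i}\sigma_i)\Bigr]\,dx = 0
\quad \forall\, v \in C^1(X).
```

Since the bracket integrates to zero against every $v \in C^1(X)$, it vanishes (almost everywhere), which is (16).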

Remark 6 (Role of the terminal measure) An important feature of the SOS tightenings (10) is that they are feasible for arbitrarily low degrees (see Remark 4), which is crucial from a practical point of view and is not satisfied with other, more obvious, formulations (e.g., those not involving a stopping function in (4)): the reason for this is that, in the absence of a terminal measure (i.e., $\rho_T = 0$), the discounted Liouville equation (16) may not have a solution with a polynomial $\rho$ even though $\rho_0$ and the dynamics are polynomial. Indeed, for example with $f = -x$, $f_{u_i} = 0$, $\beta = 1$, $\rho_0 = 1$ on $X = [-1,1]$ and zero elsewhere, the only solution to (16) with $\rho_T = 0$ is $\rho(x) = -\ln(|x|)$.
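
The example in Remark 6 can be checked numerically. The sketch below evaluates the left-hand side of (16) for $\rho(x) = -\ln|x|$ with $f(x) = -x$, $\beta = 1$, $\rho_0 = 1$, $\rho_T = 0$ and no control term; the function names and the central-difference step `h` are choices made for this illustration.

```python
import math

def rho(x):
    """Candidate density from Remark 6."""
    return -math.log(abs(x))

def residual(x, h=1e-5):
    """Left-hand side of (16): -rho0 + beta*rho + d/dx(f*rho),
    with f(x) = -x, beta = 1, rho0 = 1, rho_T = 0 and no control term."""
    flux = lambda y: -y * rho(y)                       # f(y) * rho(y)
    dflux = (flux(x + h) - flux(x - h)) / (2.0 * h)    # central difference
    return -1.0 + rho(x) + dflux

sample_points = (0.1, 0.5, -0.7, 0.9)
residuals = [residual(x) for x in sample_points]
```

The residual vanishes (up to finite-difference error) away from the origin and $\rho(\pm 1) = 0$, but $\rho$ is unbounded at $x = 0$ and therefore not a polynomial, which is the obstruction the terminal density $\rho_T$ removes.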

Theorem 2 immediately enables us to prove a representation of the cost of problem (9) in terms of trajectories of (1).

Lemma 1 If $(\rho,\rho_0,\rho_T,\sigma)$ is feasible in (9) and $u = \sigma/\rho$, then

$$\begin{aligned}
p(\rho,\rho_0,\rho_T,\sigma) ={}& \int_X \int_0^\infty \int_0^{\tau} e^{-\beta t}\, l_x(x(t\,|\,x_0))\,dt\,d\tau_{x_0}(\tau)\,\rho_0(x_0)\,dx_0 \\
&+ \sum_{i=1}^m \int_X \int_0^\infty \int_0^{\tau} e^{-\beta t}\, l_{u_i}(x(t\,|\,x_0))\,u_i(x(t\,|\,x_0))\,dt\,d\tau_{x_0}(\tau)\,\rho_0(x_0)\,dx_0 \\
&+ M \int_X \int_0^\infty e^{-\beta\tau}\,d\tau_{x_0}(\tau)\,\rho_0(x_0)\,dx_0,
\end{aligned} \tag{17}$$

where $x(\cdot\,|\,x_0)$ are trajectories of (1) controlled by $u(t) = u(x(t))$ and $\tau_{x_0}$ are stopping probability measures with support $\operatorname{spt}\tau_{x_0}$ included in $[0,\infty]$. Moreover the state-control trajectories $x(\cdot\,|\,x_0)$ and $u(x(\cdot\,|\,x_0))$ are feasible in (4) in the sense that $x(t\,|\,x_0) \in X$ and $u(t\,|\,x_0) \in U$ for all $t \in \operatorname{spt}\tau_{x_0}$.

Proof: Let $(\rho,\rho_0,\rho_T,\sigma)$ be feasible in (9) and let $p(\rho,\rho_0,\rho_T,\sigma)$ denote the value attained by $(\rho,\rho_0,\rho_T,\sigma)$ in (9). The equality constraint of (9) is exactly (16). Since the constraints of (9) imply $\rho = 0$ on $\partial X$, equation (14) holds with $d\mu_0 = \rho_0\,dx$, $d\mu = \rho\,dx$, $d\mu_T = \rho_T\,dx$ and $d\nu_i = u_i\,d\mu = u_i\rho\,dx = \sigma_i\,dx$, where $u_i = \sigma_i/\rho$, $i = 1,\ldots,m$, with $u \in C^1(X;U)$. By Theorem 2 (setting $v(x) = l_x(x)$ in (15b), $v(x) = 1$ in (15c) and $v(x) = l_{u_i}(x)$ in (15d)) we obtain the result (noticing that the constraints of (9) imply that $u(x) \in U$ for all $x \in X$). □


Corollary 2 If $(\rho,\rho_0,\rho_T,\sigma)$ is feasible in (9) and $u = \sigma/\rho$, then

$$p(\rho,\rho_0,\rho_T,\sigma) \ge \int_X V_u(x_0)\,\rho_0(x_0)\,dx_0. \tag{18}$$

If in addition the stopping measures $\{\tau_{x_0}\}_{x_0\in X}$ in (17) are equal to the Dirac measures $\{\delta_{\tau(x_0)}\}_{x_0\in X}$ for some stopping function $\tau \in L(X;[0,\infty])$, then

$$p(\rho,\rho_0,\rho_T,\sigma) = \int_X V_{u,\tau}(x_0)\,\rho_0(x_0)\,dx_0. \tag{19}$$

Proof: Let $(\rho,\rho_0,\rho_T,\sigma)$ be feasible in (9). Using Lemma 1, $p(\rho,\rho_0,\rho_T,\sigma)$ has representation (17), where the state-control trajectories in (17) are feasible in (4). Since the measures $\tau_{x_0}$ in (17) have unit mass for all $x_0 \in X$, we obtain (18). If $\tau_{x_0} = \delta_{\tau(x_0)}$ for some stopping function $\tau \in L(X;[0,\infty])$, then the integrals with respect to $\tau_{x_0}$ in (17) become evaluations at $\tau(x_0)$ and hence (19) holds. □

Corollary 2 immediately implies that the problem (9) (and hence problem (10)) is a tightening of the original problem (4):

Theorem 3 The optimal value $p^\star$ of (9) satisfies

$$p^\star \ge \int_X V(x)\,dx. \tag{20}$$

Proof: Follows from Corollary 2 since $\rho_0 \ge 1$ and $V_u \ge V \ge 0$. □

Now we are in a position to prove the following crucial lemma linking problems (4) and (9).


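Lemma 2 If $\{u^k \in C(X;U)\}_{k=1}^\infty$ and $\{\tau^k \in L(X;[0,\infty])\}_{k=1}^\infty$ are respectively sequences of controllers and stopping functions satisfying the conditions of Assumption 1, then the corresponding densities $(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)$ with $\rho_0^k = 1$ are feasible in (9) and satisfy

$$\lim_{k\to\infty} p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) = \int_X V(x_0)\,dx_0. \tag{21}$$

Conversely, if $\{(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)\}_{k=1}^\infty$ is a sequence such that $\lim_{k\to\infty} p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) = p^\star$ and if Assumption 1 holds, then equation (7) holds with $u^k = \sigma^k/\rho^k$.

Proof: To prove the first part of the statement consider the controllers $u^k$, stopping functions $\tau^k$ and densities $\rho^k$ from Assumption 1. Setting $\rho_0^k = 1$ and defining $\sigma_i^k := u_i^k\rho^k$ and $\rho_T^k := \rho_0^k - \beta\rho^k - \operatorname{div}(\rho^k f) - \sum_{i=1}^m \operatorname{div}(f_{u_i}\sigma_i^k) \ge 0$ we see that $(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)$ satisfy (16) with $\rho^k = 0$ on $\partial X$. Therefore $(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)$ are feasible in (9). In addition, in view of (8), the representation (17) holds with $\tau_{x_0} = \delta_{\tau^k(x_0)}$. Therefore by Lemma 1

$$p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) = \int_X V_{u^k,\tau^k}(x_0)\,\rho_0^k(x_0)\,dx_0$$

and hence (21) holds since $V_{u^k,\tau^k}$ satisfies (7) and $\rho_0^k = 1$ for all $k \ge 0$.

To prove the second part of the statement, let $\{(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)\}_{k=1}^\infty$ be any sequence such that $\lim_{k\to\infty} p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) = p^\star$. Then this sequence satisfies (21) by Theorem 3 and by the first part of Lemma 2 just proven. Therefore (7) holds with $u^k := \sigma^k/\rho^k$ since

$$p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) \ge \int_X V_{u^k}(x_0)\,dx_0$$

by Corollary 2. □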

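We will also need the following result showing that nonnegative functions vanishing on a neighborhood of $\partial X$ can be approximated by polynomials in $Q_d(X)$ vanishing on $\partial X$.

Lemma 3 Let $\rho \in C^1(X)$ be such that $\rho \ge 0$ on $X$ and $\rho = 0$ on $\{x \in X : \operatorname{dist}_{\partial X}(x) < \zeta\}$ for some $\zeta > 0$. Then for any $\epsilon > 0$ there exist $d \ge 0$ and a polynomial $\rho_d \in gQ_{d-\deg g}(X)$ such that

$$\|\rho - \rho_d\|_{C^1(X)} < \epsilon$$

and $\rho_d = 0$ on $\partial X$.

Proof: Since $g > 0$ on $X^\circ$, we can factor $\rho = gh$ with $h \in C^1(X)$ given by

$$h(x) := \begin{cases} \rho(x)/g(x) & \text{if } \operatorname{dist}_{\partial X}(x) \ge \zeta, \\ 0 & \text{otherwise.} \end{cases}$$

Since polynomials are dense in $C^1$, there exists for every $\delta > 0$ a polynomial $\tilde h \ge 0$ such that

$$\|h - \tilde h\|_{C^1(X)} < \delta. \tag{22}$$

Applying Proposition 1 to $\tilde h$ we see that there exists $p_{\tilde d} \in Q_{\tilde d}(X)$ for some $\tilde d \ge 0$ such that

$$\|\tilde h - p_{\tilde d}\|_{C^1(X)} < \delta. \tag{23}$$

Defining $\rho_d := g\,p_{\tilde d}$ we see that $\rho_d \in gQ_{d-\deg g}(X)$ with $d = \tilde d + \deg(g)$ and that $\rho_d = 0$ on $\partial X$. Finally,

$$\|\rho - \rho_d\|_{C^0} = \|hg - p_{\tilde d}\,g\|_{C^0} \le \|g\|_{C^0}\|h - p_{\tilde d}\|_{C^0} < 2\delta\|g\|_{C^0}$$

and

$$\begin{aligned}
\|\nabla\rho - \nabla\rho_d\|_{C^0} &= \|g\nabla h + h\nabla g - g\nabla p_{\tilde d} - p_{\tilde d}\nabla g\|_{C^0} \\
&\le \|\nabla g\|_{C^0}\|h - p_{\tilde d}\|_{C^0} + \|g\|_{C^0}\|\nabla h - \nabla p_{\tilde d}\|_{C^0} \\
&< 2\delta\bigl(\|\nabla g\|_{C^0} + \|g\|_{C^0}\bigr).
\end{aligned}$$

Therefore choosing $\delta$ such that $2\delta\bigl(\|\nabla g\|_{C^0} + 2\|g\|_{C^0}\bigr) < \epsilon$ gives the desired result. □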

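Now we are ready to prove our main result, Theorem 1.

Proof (of Theorem 1): Consider the sequences $\{u^k \in C(X;U)\}_{k=1}^\infty$, $\{\tau^k \in L(X;[0,\infty])\}_{k=1}^\infty$ from Assumption 1. By the first part of Lemma 2 the sequence of associated densities $(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)$ generated by $(u^k,\tau^k)$ is feasible in (9) and satisfies (21). By Assumption 1, $\rho^k = 0$ and $\sigma^k = 0$ on $\{x \in X : \operatorname{dist}_{\partial X}(x) < \gamma^k\}$ (since $\sigma^k = u^k\rho^k$) with $\gamma^k > 0$. Hence by Lemma 3 there exist polynomial densities $\rho^{k,\mathrm{pol}} \in gQ_{d_k-\deg g}(X)$, $\sigma_i^{k,\mathrm{pol}} \in gQ_{d_k-\deg g}(X)$ for some degrees $d_k$ such that

$$\|\rho^k - \rho^{k,\mathrm{pol}}\|_{C^1(X)} < 1/k, \tag{24}$$

$$\|\sigma_i^k - \sigma_i^{k,\mathrm{pol}}\|_{C^1(X)} < 1/k, \tag{25}$$

and $\bar u\rho^{k,\mathrm{pol}} - \sigma_i^{k,\mathrm{pol}} \in gQ_{d_k-\deg g}(X)$ for all $i = 1,\ldots,m$ (since $u^k(x) \in U = [0,\bar u]^m$ for all $x \in X$ and hence $\bar u\rho^k \ge \sigma_i^k$ on $X$). Notice also that since $\rho^{k,\mathrm{pol}} \in gQ_{d_k-\deg g}(X)$, the polynomial $-\rho^{k,\mathrm{pol}}$ satisfies the corresponding inclusion constraint of (10). Next, since $\rho_0^k \ge 1$ and $\rho_T^k \ge 0$, we can find, by Corollary 1, polynomial densities $\hat\rho_0^{k,\mathrm{pol}} \in 1 + Q_{d_k}(X)$ and $\hat\rho_T^{k,\mathrm{pol}} \in Q_{d_k}(X)$ such that

$$\|\rho_0^k - \hat\rho_0^{k,\mathrm{pol}}\|_{C^0(X)} < 1/k, \tag{26}$$

$$\|\rho_T^k - \hat\rho_T^{k,\mathrm{pol}}\|_{C^0(X)} < 1/k. \tag{27}$$

Since $(\rho^k,\rho_0^k,\rho_T^k,\sigma^k)$ satisfy the equality constraint of (9) we have

$$\hat\rho_T^{k,\mathrm{pol}} - \hat\rho_0^{k,\mathrm{pol}} + \beta\rho^{k,\mathrm{pol}} + \operatorname{div}(\rho^{k,\mathrm{pol}} f) + \sum_{i=1}^m \operatorname{div}(\sigma_i^{k,\mathrm{pol}} f_{u_i}) = \omega^k,$$

where

$$\omega^k := (\hat\rho_T^{k,\mathrm{pol}} - \rho_T^k) + \beta(\rho^{k,\mathrm{pol}} - \rho^k) - (\hat\rho_0^{k,\mathrm{pol}} - \rho_0^k) + \operatorname{div}\bigl[(\rho^{k,\mathrm{pol}} - \rho^k)f\bigr] + \sum_{i=1}^m \operatorname{div}\bigl[(\sigma_i^{k,\mathrm{pol}} - \sigma_i^k)f_{u_i}\bigr]$$

is a polynomial such that $\|\omega^k\|_{C^0} \to 0$ as $k \to \infty$ in view of (24)-(27). Defining the constants $\epsilon^k := 1/k + \|\omega^k\|_{C^0}$ and setting

$$\rho_T^{k,\mathrm{pol}} := \hat\rho_T^{k,\mathrm{pol}} + \epsilon^k, \qquad \rho_0^{k,\mathrm{pol}} := \hat\rho_0^{k,\mathrm{pol}} + \epsilon^k + \omega^k,$$

we see that

$$\rho_T^{k,\mathrm{pol}} - \rho_0^{k,\mathrm{pol}} + \beta\rho^{k,\mathrm{pol}} + \operatorname{div}(\rho^{k,\mathrm{pol}} f) + \sum_{i=1}^m \operatorname{div}(\sigma_i^{k,\mathrm{pol}} f_{u_i}) = 0,$$

and $\rho_0^{k,\mathrm{pol}} - 1$ and $\rho_T^{k,\mathrm{pol}}$ are strictly positive on $X$ and hence belong to $Q_{d_k}(X)$. The densities $(\rho^{k,\mathrm{pol}},\rho_0^{k,\mathrm{pol}},\rho_T^{k,\mathrm{pol}},\sigma^{k,\mathrm{pol}})$ are therefore feasible in (10) for some $d_k \ge 0$. In addition, by construction, $\|\rho_0^{k,\mathrm{pol}} - \rho_0^k\|_{C^0} \to 0$ and $\|\rho_T^{k,\mathrm{pol}} - \rho_T^k\|_{C^0} \to 0$ as $k \to \infty$. Therefore we have obtained a sequence of polynomial densities $(\rho^{k,\mathrm{pol}},\rho_0^{k,\mathrm{pol}},\rho_T^{k,\mathrm{pol}},\sigma^{k,\mathrm{pol}})$ that are feasible in (10) and such that

$$\|\rho_0^{k,\mathrm{pol}} - \rho_0^k\|_{C^0} \to 0,\quad \|\rho_T^{k,\mathrm{pol}} - \rho_T^k\|_{C^0} \to 0,\quad \|\rho^{k,\mathrm{pol}} - \rho^k\|_{C^1} \to 0,\quad \|\sigma^{k,\mathrm{pol}} - \sigma^k\|_{C^1} \to 0$$

as $k \to \infty$. This implies that

$$p(\rho^{k,\mathrm{pol}},\rho_0^{k,\mathrm{pol}},\rho_T^{k,\mathrm{pol}},\sigma^{k,\mathrm{pol}}) - p(\rho^k,\rho_0^k,\rho_T^k,\sigma^k) \to 0$$

and hence $(\rho^{k,\mathrm{pol}},\rho_0^{k,\mathrm{pol}},\rho_T^{k,\mathrm{pol}},\sigma^{k,\mathrm{pol}})$ satisfies (21) and so $p(\rho^{k,\mathrm{pol}},\rho_0^{k,\mathrm{pol}},\rho_T^{k,\mathrm{pol}},\sigma^{k,\mathrm{pol}}) \to p^\star$ by Theorem 3. Therefore (7) holds with the rational controllers $u^k := \sigma^{k,\mathrm{pol}}/\rho^{k,\mathrm{pol}}$ by the second part of Lemma 2. This finishes the proof. □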


6 Value function approximations 

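In this section we propose a converging hierarchy of approximations from below and from
above to the value function $V_u$ associated to a rational controller $u = \sigma/\rho$ with $\sigma \in \mathbb{R}[x]^m$ and
$\rho \in \mathbb{R}[x]$ satisfying $0 \le \sigma_i \le \bar u\rho$ on $X$. In addition we describe a hierarchy of approximations
from below to the optimal value function $V$. This is useful as a post-processing step, once
a rational control law has been computed as described in Section ??, providing an explicit
bound on the suboptimality of the controller.

Note that, trivially, approximations from above to $V_u$ provide approximations from above
to $V$. Defining $\hat f := \rho f + \sum_{i=1}^m f_{u_i}\sigma_i$ and $\hat l := \rho l_x + \sum_{i=1}^m l_{u_i}\sigma_i$, the degree-$d$
polynomial upper and lower bounds are given by

$$\begin{array}{ll} \min\limits_{\overline V_u \in \mathbb{R}[x]_d} & \int_X \overline V_u(x)\,dx \\ \text{s.t.} & \beta\rho\overline V_u - \nabla\overline V_u\cdot\hat f - \hat l \in Q_d(X) \\ & \overline V_u - M \in Q_d(X) + \bar g\,\mathbb{R}[x]_{d-\deg\bar g} \end{array} \tag{28}$$

and

$$\begin{array}{ll} \max\limits_{\underline V_u \in \mathbb{R}[x]_d} & \int_X \underline V_u(x)\,dx \\ \text{s.t.} & -(\beta\rho\underline V_u - \nabla\underline V_u\cdot\hat f - \hat l) \in Q_d(X) \\ & M - \underline V_u \in Q_d(X) + \bar g\,\mathbb{R}[x]_{d-\deg\bar g}, \end{array} \tag{29}$$

respectively. Fixing a basis of $\mathbb{R}[x]_d$, the objective functions of (28) and (29) become linear
in the coefficients of $\overline V_u$, respectively $\underline V_u$, in this basis. Problems (28) and (29) are therefore
convex SOS problems and immediately translate to SDPs (see Section 2.2).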
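To illustrate why the objective is linear in the coefficients, the following minimal sketch computes the objective vector for the monomial basis. It assumes, purely for illustration, that $X$ is the box $[-1,1]^n$ so that the monomial integrals are available in closed form; the helper names `monomial_exponents`, `box_moment` and `integral_of_polynomial` are hypothetical and not part of the paper, which uses a Chebyshev basis and a general set $X$.

```python
from fractions import Fraction
from itertools import product


def monomial_exponents(n, d):
    """All exponent tuples alpha in N^n with |alpha| <= d (monomial basis of R[x]_d)."""
    return [a for a in product(range(d + 1), repeat=n) if sum(a) <= d]


def box_moment(alpha):
    """Integral of x^alpha over the box [-1, 1]^n (assumed set X, illustration only)."""
    m = Fraction(1)
    for a in alpha:
        # The 1D moment of t^a on [-1, 1] is 0 for odd a and 2 / (a + 1) for even a.
        m *= Fraction(0) if a % 2 else Fraction(2, a + 1)
    return m


def objective_vector(n, d):
    """Vector c such that the integral of V over the box equals sum_alpha c[alpha] * V_alpha."""
    return {alpha: box_moment(alpha) for alpha in monomial_exponents(n, d)}


def integral_of_polynomial(coeffs, n, d):
    """Evaluate the linear objective for a polynomial given as {alpha: coefficient}."""
    c = objective_vector(n, d)
    return sum(c[alpha] * Fraction(value) for alpha, value in coeffs.items())


# V(x) = 1 + x1^2 + 3*x1*x2 on [-1, 1]^2
V = {(0, 0): 1, (2, 0): 1, (1, 1): 3}
print(integral_of_polynomial(V, n=2, d=2))
```

In an SDP formulation the vector returned by `objective_vector` would play the role of the cost vector multiplying the decision variables that hold the coefficients of $\overline V_u$ or $\underline V_u$.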


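Theorem 4 Let $\overline V_u^d$ and $\underline V_u^d$ denote solutions to (28) and (29) of degree $d$. Then
$\overline V_u^d \ge V_u \ge \underline V_u^d$ on $X$ and

$$\lim_{d\to\infty}\int_X \overline V_u^d(x)\,dx = \int_X V_u(x)\,dx = \lim_{d\to\infty}\int_X \underline V_u^d(x)\,dx. \tag{30}$$

Proof: See Appendix A. □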

As a simple corollary we obtain a converging sequence of polynomial over-approximations
to $V$, the optimal value function of (??):

Theorem 5 Let $\overline V_{u_{d_1}}^{d_2}$ denote the degree $d_2$ polynomial approximation from above to the
value function associated to the rational controller $u_{d_1}$ obtained from (10) using (11). Then
$\overline V_{u_{d_1}}^{d_2} \ge V$ on $X$ and

$$\lim_{d_1\to\infty}\lim_{d_2\to\infty}\int_X \big(\overline V_{u_{d_1}}^{d_2}(x) - V(x)\big)\,dx = 0.$$

Now we describe a hierarchy of lower bounds on $V$:

$$\begin{array}{ll} \max\limits_{\underline V \in \mathbb{R}[x]_d,\ p \in \mathbb{R}[x]_d^m} & \int_X \underline V(x)\,dx \\ \text{s.t.} & l_x - \beta\underline V + \nabla\underline V\cdot f + \bar u\sum_{i=1}^m p_i \in Q_d(X) \\ & l_{u_i} + \nabla\underline V\cdot f_{u_i} - p_i \in Q_d(X) \\ & -p_i \in Q_d(X) \\ & M - \underline V \in Q_d(X) + \bar g\,\mathbb{R}[x]_{d-\deg\bar g}. \end{array} \tag{31}$$

Theorem 6 If $\underline V$ is feasible in (31), then $\underline V \le V$ on $X$.

Proof: Follows by similar arguments based on Gronwall's Lemma as in the proof of Theorem 4. □

Remark 7 The question whether $\underline V$ converges from below to $V$ as the degree $d$ in (31) tends to
infinity is open (although likely to hold). A proof would require an extension of the superposition
Theorem ?? to non-Lipschitz vector fields (in the spirit of the finite-time superposition
result of [??, Theorem 4.4]) or an extension of the argument of [??] to the case of $\mu_T \ne 0$,
either of which is beyond the scope of this paper.

Remark 8 Besides the closed-loop cost function with respect to the OCP (??), one can assess
other aspects of the closed-loop behavior of the dynamical system (??) controlled by the rational
controller $u = \sigma/\rho$. In particular, regions of attraction or maximum controlled invariant sets
can be estimated by methods which extend readily to the case of rational systems.


7 Numerical examples 

This section demonstrates the approach on numerical examples. To improve the numerical
conditioning of the SDPs solved, we use the Chebyshev basis to parametrize all polynomials.
More specifically, we use tensor products of univariate Chebyshev polynomials of the first
kind to obtain a multivariate Chebyshev basis. We note, however, that similar results, albeit
slightly less accurate, could be obtained with the standard multivariate monomial basis (in
which case the SDPs can be readily formulated using high-level modelling tools such as
Yalmip [??] or SOSOPT [??]). The resulting SDPs were solved using MOSEK.

7.1 Nonlinear double integrator 

As our first example we consider the nonlinear double integrator

$$\dot x_1 = x_2 + 0.1x_1^3, \qquad \dot x_2 = 0.3u$$

subject to the constraints $u \in [-1,1]$ and $x \in X := \{x : \|x\|_2 \le 1\}$ and stage costs
$l_x(x) = x^\top x$ and $l_u(x) = 0$. The discount factor $\beta$ was set to 1; the constant $M$ to $1.01 >
\sup_{x\in X}\{x^\top x\}/\beta = 1$. First we obtain a rational controller of degree six by solving (10) with
$d = 6$. The graph of the controller is shown in Figure 1. Next we obtain a polynomial upper
bound $\overline V_u$ of degree 14 on the value function associated to $u$ by solving (28) with $d = 14$. To
assess suboptimality of the controller $u$ we compare it with a lower bound $\underline V$ on the optimal
value function of the problem (??) obtained by solving (31) with $d = 14$. The graphs of the
two value functions are plotted in Figure 2. We see that the gap between the upper bound on
$V_u$ and the lower bound on $V$ is relatively small, verifying a good performance of the extracted
controller. Quantitatively, the average performance gap defined as $100\int_X(\overline V_u - \underline V)\,dx / \int_X \underline V\,dx$
is equal to 19.

Figure 1: Nonlinear double integrator - rational controller of degree six.

Figure 2: Nonlinear double integrator - upper bound on the value function $V_u$ associated to the
extracted controller (red); lower bound on the optimal value function $V$ (blue).

7.2 Controlled Lotka-Volterra 

In our second example we apply the proposed method to a population model governed by
$n$-dimensional controlled Lotka-Volterra equations

$$\dot x = r \circ x \circ (\mathbf{1} - Ax) + u^+ - u^-,$$

where $\mathbf{1} \in \mathbb{R}^n$ is the vector of ones and $\circ$ denotes the componentwise (Hadamard) product.
Each component $x_i$ of the state $x \in \mathbb{R}^n$ represents the size of the population of species $i$. The
vector $r \in \mathbb{R}^n$ contains the intrinsic growth rates of each species and the matrix $A \in \mathbb{R}^{n\times n}$
captures the interaction between the species. If $A_{i,j} > 0$, then species $j$ is harmful to species
$i$ (e.g., competes for resources) and if $A_{i,j} < 0$, then species $j$ is helpful to species $i$ (e.g.,
species $i$ feeds on species $j$); the diagonal components $A_{i,i}$ are normalized to one. The control
inputs $u^+ \in [0,1]^n$ and $u^- \in [0,1]^n$ represent, respectively, the inflow and outflow of new
species from the outside. For our numerical example we select $n = 4$ and model parameters

$$r = \begin{bmatrix} 1 \\ 0.6 \\ 0.4 \\ 0.2 \end{bmatrix}, \qquad A = \begin{bmatrix} 1 & 0.3 & 0.4 & 0.2 \\ -0.2 & 1 & 0.4 & -0.1 \\ -0.1 & -0.2 & 1 & 0.3 \\ -0.1 & -0.2 & -0.3 & 1 \end{bmatrix},$$

which results in a system with four states and eight control inputs. The economic objective is
to harvest species number one while ensuring that no species goes extinct. More specifically,
the cost function is $l_u(x) = (-1.0, 0.5, 0.6, 0.8, 1.1, 2, 4, 6)$ and $l_x(x) = 1$, where the vector
$l_u(x)$ is associated with the control input vector $u = (u^-, u^+)$. Therefore there is a reward
for harvesting species number one and a cost associated with both introduction and hunting
of all other species, the cost of hunting being lower than the cost of introduction. The
reason for choosing $l_x(x) = 1$ is to make the joint stage cost $l(x,u)$ (??) nonnegative;
this choice does not affect optimality since $l_x(x(t)) = 1$ irrespective of the control input
applied. The non-extinction constraint is expressed as $g(x) = 1 - (Q^{-1}x - q)^\top(Q^{-1}x - q) \ge
0$ with $Q = \mathrm{diag}(0.475\cdot\mathbf{1})$ and $q = 0.525\cdot\mathbf{1}$. We choose $\beta = 1$ and $M = 16.16 >
\sup_{x\in X,\,u\in U}\{l(x,u)\}/\beta = 16$. We apply the coordinate transformation $x = Q\hat x + q$ and
obtain a rational controller of degree eight by solving (10). Figure 3 shows plots for
two different initial conditions, one with a low population size of the first species and one with
a high one. Finally, we evaluate the suboptimality of the extracted controller using the polynomial
lower bound on the optimal value function of degree 11 obtained from (31). Using Monte
Carlo simulation with 1000 samples of initial conditions drawn from a uniform distribution
over the constraint set, we obtain an average cost of the extracted controller of 0.89 whereas
the lower bound predicts an average cost of 0.72; hence the extracted controller is no more than
23.6 % suboptimal (modulo the statistical estimation error). Note that we could also obtain
a deterministic suboptimality estimate using the upper bound on the value function of the
extracted controller obtained from (28). In this case, however, the upper bound (28) is not
informative. Nevertheless, the Monte Carlo simulation along with the lower bound (31) is a
viable alternative in this case, since the extracted controller is simple and hence trajectories
of the controlled system can be simulated rapidly.
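The constants quoted in this example can be re-derived from the stated model data. The sketch below is only a plausibility check under the assumption that the parameters are read correctly from the text; the function names are hypothetical. It evaluates the Lotka-Volterra vector field, the range of the joint stage cost over the input box, and the suboptimality percentage implied by the two reported average costs.

```python
# Model data as stated in the example (n = 4).
r = [1.0, 0.6, 0.4, 0.2]
A = [
    [1.0, 0.3, 0.4, 0.2],
    [-0.2, 1.0, 0.4, -0.1],
    [-0.1, -0.2, 1.0, 0.3],
    [-0.1, -0.2, -0.3, 1.0],
]
# Input cost weights for u = (u_minus, u_plus), each component in [0, 1].
l_u = [-1.0, 0.5, 0.6, 0.8, 1.1, 2.0, 4.0, 6.0]
l_x = 1.0
beta = 1.0


def vector_field(x, u_plus, u_minus):
    """Right-hand side r o x o (1 - A x) + u_plus - u_minus."""
    n = len(x)
    Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
    return [r[i] * x[i] * (1.0 - Ax[i]) + u_plus[i] - u_minus[i] for i in range(n)]


def stage_cost_sup():
    """Largest joint stage cost l_x + l_u . u over u in [0, 1]^8."""
    return l_x + sum(max(c, 0.0) for c in l_u)


def stage_cost_inf():
    """Smallest joint stage cost over u in [0, 1]^8 (shows why l_x = 1 is chosen)."""
    return l_x + sum(min(c, 0.0) for c in l_u)


def suboptimality_percent(cost, lower_bound):
    """Relative gap of an achieved cost with respect to a lower bound, in percent."""
    return 100.0 * (cost / lower_bound - 1.0)


print(stage_cost_sup() / beta)            # bound that M = 16.16 has to exceed
print(stage_cost_inf())                   # the joint stage cost is nonnegative
print(suboptimality_percent(0.89, 0.72))  # gap implied by the Monte Carlo averages
```

The smallest joint stage cost being zero is exactly the effect of the offset $l_x(x) = 1$ discussed above: it cancels the reward $-1$ for harvesting species one.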


8 Conclusion 

This paper presented a method to obtain a sequence of rational controllers asymptotically
optimal (under suitable technical assumptions) in a discounted optimal control problem and
a method to explicitly estimate the suboptimality of each controller. The rational controller of a
given degree is obtained by solving a single sum-of-squares problem with no extraction step.
The SOS problem solved is feasible for any degree and therefore this method allows one to trade
off complexity of the controller against performance.

Figure 3: Controlled Lotka-Volterra - (blue) trajectory starting from a high initial population
of the first species and low initial population of the other species; (red) trajectory starting
from low initial population of the first species and high initial population of the other species.

The approach is based on lifting the nonconvex optimal control problem into an infinite-dimensional
space of measures with continuous densities, where this problem becomes linear.
Crucially, this problem is a tightening of the original problem, which follows immediately
from the representation result for solutions of the discounted Liouville equation with a terminal
measure (Theorem ??). Asymptotic optimality of the extracted controllers then follows
by approximating the asymptotically optimal continuous densities (guaranteed to exist by
Assumption ??) with polynomial densities in such a way that these densities correspond to
the densities of the dynamical system (this is the essence of the proof of Theorem ??).


9 Appendix A 


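This Appendix contains the proof of Theorem 4; we use the same notation as in Section 6.
The inequalities $\overline V_u^d \ge V_u \ge \underline V_u^d$ follow from Gronwall's Lemma by noticing that the
constraints of (28) and (29) imply that

$$\nabla\overline V_u^d\cdot\Big(f + \sum_{i=1}^m f_{u_i}u_i\Big) \le \beta\overline V_u^d - l_x - \sum_{i=1}^m l_{u_i}u_i, \tag{32}$$

$$\nabla\underline V_u^d\cdot\Big(f + \sum_{i=1}^m f_{u_i}u_i\Big) \ge \beta\underline V_u^d - l_x - \sum_{i=1}^m l_{u_i}u_i \tag{33}$$

on $X$ and $\overline V_u^d \ge M$, $\underline V_u^d \le M$ on $\partial X$. We detail the argument for the inequality $\overline V_u^d \ge V_u$,
the inequality $V_u \ge \underline V_u^d$ being similar. Given $x_0 \in X$ the inequality (32) implies that

$$\frac{d}{dt}\overline V_u^d(x(t\,|\,x_0)) \le \beta\overline V_u^d(x(t\,|\,x_0)) - \Big[l_x(x(t\,|\,x_0)) + \sum_{i=1}^m l_{u_i}(x(t\,|\,x_0))\,u_i(x(t\,|\,x_0))\Big]$$

and therefore by Gronwall's Lemma

$$\overline V_u^d(x(t\,|\,x_0)) \le e^{\beta t}\,\overline V_u^d(x_0) - \int_0^t e^{\beta(t-s)}\Big[l_x(x(s\,|\,x_0)) + \sum_{i=1}^m l_{u_i}(x(s\,|\,x_0))\,u_i(x(s\,|\,x_0))\Big]\,ds$$

and hence

$$\overline V_u^d(x_0) \ge e^{-\beta t}\,\overline V_u^d(x(t\,|\,x_0)) + \int_0^t e^{-\beta s}\Big[l_x(x(s\,|\,x_0)) + \sum_{i=1}^m l_{u_i}(x(s\,|\,x_0))\,u_i(x(s\,|\,x_0))\Big]\,ds \tag{34}$$

for all $t \in [0,\tau]$, where $\tau := \inf\{t \ge 0 \mid x(t\,|\,x_0) \notin X\} \in [0,\infty]$ is the first exit time of $X$.
Next we observe that $V_u$, the value function associated to $u$, is equal to

$$V_u(x_0) = \begin{cases} \displaystyle\int_0^\infty e^{-\beta s}\Big[l_x(x(s\,|\,x_0)) + \sum_{i=1}^m l_{u_i}(x(s\,|\,x_0))\,u_i(x(s\,|\,x_0))\Big]\,ds, & \tau = \infty, \\[2ex] \displaystyle e^{-\beta\tau}M + \int_0^\tau e^{-\beta s}\Big[l_x(x(s\,|\,x_0)) + \sum_{i=1}^m l_{u_i}(x(s\,|\,x_0))\,u_i(x(s\,|\,x_0))\Big]\,ds, & \tau < \infty. \end{cases}$$

In view of (34), we conclude that $\overline V_u^d(x_0) \ge V_u(x_0)$ if $\tau = \infty$ since $\overline V_u^d$ is polynomial and hence
bounded on $X$ (and hence $e^{-\beta t}\,\overline V_u^d(x(t\,|\,x_0)) \to 0$); and we conclude that $\overline V_u^d(x_0) \ge V_u(x_0)$ if
$\tau < \infty$ since $x(\tau\,|\,x_0) \in \partial X$ and $\overline V_u^d \ge M$ on $\partial X$.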
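The two-case expression for $V_u$ can be evaluated by direct simulation. The following minimal sketch does so for the nonlinear double integrator of Section 7.1, under several assumptions that are not from the paper: the feedback `controller` is a hypothetical saturated linear law standing in for the rational controller, the dynamics are taken as read from the example, and a forward Euler scheme with a truncated horizon replaces exact integration.

```python
import math

BETA = 1.0   # discount factor used in the example
M = 1.01     # exit penalty used in the example


def controller(x1, x2):
    """Hypothetical saturated linear feedback; NOT the rational controller of the paper."""
    return max(-1.0, min(1.0, -x1 - x2))


def dynamics(x1, x2, u):
    """Nonlinear double integrator as read from Section 7.1 (assumed form)."""
    return x2 + 0.1 * x1 ** 3, 0.3 * u


def discounted_cost(x1, x2, dt=1e-3, horizon=30.0):
    """Approximate V_u(x0): discounted running cost x'x until the first exit
    from the unit disc, plus the discounted penalty M if the state exits."""
    cost, t = 0.0, 0.0
    while t < horizon:
        if x1 * x1 + x2 * x2 > 1.0:
            # First exit time reached: add exp(-beta * tau) * M and stop.
            return cost + math.exp(-BETA * t) * M
        u = controller(x1, x2)
        cost += dt * math.exp(-BETA * t) * (x1 * x1 + x2 * x2)
        d1, d2 = dynamics(x1, x2, u)  # forward Euler step
        x1, x2 = x1 + dt * d1, x2 + dt * d2
        t += dt
    return cost


print(discounted_cost(0.5, 0.2))
```

Since $l_x \le 1$ on $X$ and $\beta = 1$, every value computed this way is bounded by $M$ up to discretization error, which mirrors the role of $M$ as an upper bound on the value function.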

Convergence of the upper and lower bounds (30) follows from Theorem ?? using infinite-dimensional
LP duality and standard results on the convergence of moment relaxations.
The proof is similar to the proof of Theorem 5 in [??] or Theorem 3.6 in [13] and therefore we
only outline it. The hierarchy of SOS programming problems (28) and (29) is dual to the
hierarchy of moment relaxations of an infinite-dimensional LP in the cone of nonnegative
measures whose dual is an infinite-dimensional LP in $C^1(X)$, and feasible solutions of this
dual provide upper or lower bounds on $V_u$. Crucial to applying infinite-dimensional duality
results (e.g., [??, Theorem 3.10]) is the boundedness of measures satisfying the discounted
Liouville equation (14) with $\nu_i \le \bar u\mu$ and $\mu_0 = \lambda_X$, where $\lambda_X$ is the restriction of the
Lebesgue measure to $X$. Plugging $v = 1$ in (14) we have $\mu_T(X) + \beta\mu(X) = \mu_0(X)$. Since
$\mu_0(X) = \lambda_X(X) = \mathrm{vol}\,X < \infty$ and $\beta > 0$ we conclude that $\mu_T$ and $\mu$ are indeed bounded,
which implies that $\nu_i$ is also bounded for $i = 1,\ldots,m$. Equally important is the absence of a
duality gap between the finite-dimensional moment relaxations and SOS tightenings (which
are both SDP problems); this follows immediately from the presence of the constraint $g_i =
N - \|x\|_2^2$ among the constraints describing $X$, which implies the boundedness of the truncated
moment sequences feasible in the moment relaxations. The absence of a duality gap then
follows from [23, Lemma 2]. □
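As a sanity check of the mass balance $\mu_T(X) + \beta\mu(X) = \mu_0(X)$, consider the simplest case, assumed here purely for illustration, of a unit Dirac initial measure $\mu_0 = \delta_{x_0}$ whose trajectory stays in $X$ and is stopped at a deterministic time $\tau$. Using the trajectory representation (36) of Appendix B with $v = 1$, the masses can be computed in closed form:

```latex
\mu(X) = \int_0^{\tau} e^{-\beta t}\,dt = \frac{1 - e^{-\beta\tau}}{\beta},
\qquad
\mu_T(X) = e^{-\beta\tau},
\qquad\text{so that}\qquad
\mu_T(X) + \beta\,\mu(X) = e^{-\beta\tau} + \bigl(1 - e^{-\beta\tau}\bigr) = 1 = \mu_0(X).
```

The discounting is what keeps $\mu(X) \le 1/\beta$ finite even when $\tau = \infty$.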


10 Appendix B 


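This appendix presents a proof of Theorem ??. We will prove a slightly more general version
of the result from which Theorem ?? immediately follows:

Theorem 7 Let $f : \mathbb{R}^n \to \mathbb{R}^n$ be globally Lipschitz and let the nonnegative measures $\mu$, $\mu_0$,
$\mu_T$ on $\mathbb{R}^n$ satisfy

$$\int v\,d\mu_T = \int v\,d\mu_0 + \int (\nabla v\cdot f - \beta v)\,d\mu \tag{35}$$

for all $v \in C^1(\mathbb{R}^n)$. Then there exists an ensemble of probability measures $\{\tau_{x_0}\}_{x_0\in X}$ with
$\mathrm{spt}\,\tau_{x_0} \subset [0,\infty]$ and an ensemble of trajectories $\{x(\cdot\,|\,x_0)\}_{x_0\in X}$ of the ODE $\dot x = f(x)$ such that

$$\int_{\mathbb{R}^n} v(x)\,d\mu_0(x) = \int_{\mathbb{R}^n} v(x(0\,|\,x_0))\,d\mu_0(x_0), \tag{36a}$$

$$\int_{\mathbb{R}^n} v(x)\,d\mu(x) = \int_{\mathbb{R}^n}\int_0^\infty\int_0^\tau e^{-\beta t}\,v(x(t\,|\,x_0))\,dt\,d\tau_{x_0}(\tau)\,d\mu_0(x_0), \tag{36b}$$

$$\int_{\mathbb{R}^n} v(x)\,d\mu_T(x) = \int_{\mathbb{R}^n}\int_0^\infty e^{-\beta\tau}\,v(x(\tau\,|\,x_0))\,d\tau_{x_0}(\tau)\,d\mu_0(x_0) \tag{36c}$$

for all $v \in L^\infty(\mathbb{R}^n)$.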


Theorem ?? follows from Theorem 7 by setting $\tilde f = f + \sum_{i=1}^m f_{u_i}u_i$ and modifying $f$ and
$f_{u_i}$ outside the compact set $X$ such that $\tilde f$ is globally Lipschitz.¹ The conclusion that
$x(t\,|\,x_0) \in X$ for all $t \in \mathrm{spt}\,\tau_{x_0}$ follows by taking $v(x) = I_{\mathbb{R}^n\setminus X}(x)$ in (36), where $I_A$ is
the indicator function of a set $A$, i.e. $I_A(x) = 1$ if $x \in A$ and $I_A(x) = 0$ otherwise.

¹ Such a modification is always possible. For instance, let $\tilde f(x) = \min_{y\in X}\{f(y) + \sum_{i=1}^m f_{u_i}(y)u_i(y) + L\|x - y\|\}$
(taken componentwise), where $L$ is the Lipschitz constant of $f + \sum_{i=1}^m f_{u_i}u_i$ on $X$.

Suppose therefore that (35) holds. First we will prove a simple result. In the rest of this
Appendix we will use the notation $C_c^k$ for the space of all compactly supported $k$-times
continuously differentiable functions.

Lemma 4 For any $w \in C_c^1(\mathbb{R}^n)$, the equation

$$\nabla v\cdot f - \beta v = w \tag{37}$$

has a solution $v$ such that for all $x_0 \in \mathbb{R}^n$ it holds

$$v(x_0) = -\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt. \tag{38}$$

Proof: Since $f$ is globally Lipschitz the solution $x(t\,|\,x_0)$ is defined for all $x_0 \in \mathbb{R}^n$ and
all $t \ge 0$. Therefore (38) is well defined (notice that $w$ is bounded and $\beta > 0$). Direct
computation then gives:

$$\begin{aligned} \nabla v\cdot f(x(t\,|\,x_0)) &= \frac{d}{dt}v(x(t\,|\,x_0)) \\ &= -\frac{d}{dt}\int_0^\infty e^{-\beta s}\,w(x(s\,|\,x(t\,|\,x_0)))\,ds \\ &= -\frac{d}{dt}\int_0^\infty e^{-\beta s}\,w(x(t+s\,|\,x_0))\,ds \\ &= -\int_0^\infty e^{-\beta s}\,\nabla w(x(t+s\,|\,x_0))\cdot f(x(t+s\,|\,x_0))\,ds \\ &= -\int_0^\infty e^{-\beta s}\,\frac{d}{ds}w(x(t+s\,|\,x_0))\,ds \\ &= -\beta\int_0^\infty e^{-\beta s}\,w(x(t+s\,|\,x_0))\,ds - \Big[e^{-\beta s}\,w(x(t+s\,|\,x_0))\Big]_0^\infty \\ &= -\beta\int_0^\infty e^{-\beta s}\,w(x(s\,|\,x(t\,|\,x_0)))\,ds + w(x(t\,|\,x_0)) \\ &= \beta v(x(t\,|\,x_0)) + w(x(t\,|\,x_0)). \end{aligned}$$

Setting $t = 0$, we arrive at (37). □
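Lemma 4 can be checked numerically on a one-dimensional example. The sketch below makes illustrative choices that are not from the paper: the vector field $f(x) = -x$ (whose flow is $x(t\,|\,x_0) = x_0e^{-t}$), the compactly supported $C^1$ function $w(x) = (1 - x^2)^2$ for $|x| < 1$ and $w(x) = 0$ otherwise, $\beta = 1$, a truncated trapezoid rule for (38), and a central difference for the gradient. It evaluates the residual of equation (37).

```python
import math

BETA = 1.0  # discount factor (assumed value for the illustration)


def f(x):
    """Globally Lipschitz vector field; its flow is x(t | x0) = x0 * exp(-t)."""
    return -x


def flow(t, x0):
    return x0 * math.exp(-t)


def w(x):
    """Compactly supported C^1 right-hand side (illustrative choice)."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0


def v(x0, horizon=40.0, steps=20000):
    """Formula (38): v(x0) = -int_0^inf exp(-beta t) w(x(t|x0)) dt,
    approximated by the trapezoid rule on the truncated interval [0, horizon]."""
    h = horizon / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-BETA * t) * w(flow(t, x0))
    return -h * total


def residual(x0, eps=1e-4):
    """Residual grad v . f - beta v - w of equation (37) at x0,
    with grad v approximated by a central difference."""
    dv = (v(x0 + eps) - v(x0 - eps)) / (2.0 * eps)
    return dv * f(x0) - BETA * v(x0) - w(x0)


for x0 in (0.0, 0.7, -1.3):
    print(x0, residual(x0))
```

For $|x_0| < 1$ the integral in (38) can also be evaluated by hand in this example, giving $v(x_0) = -(1 - \tfrac{2}{3}x_0^2 + \tfrac{1}{5}x_0^4)$, which satisfies (37) exactly.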


Proof (of Theorem 7): We will proceed in several steps.


Two Diracs. We start with the simplest case of $\mu_0 = \delta_{x_0}$ and $\mu_T = a\delta_{x_T}$, $a > 0$, $x_T \in \mathbb{R}^n$,
and some $\mu \ge 0$. First we will show that if $(\mu_T, \mu_0, \mu)$ solves (35) then there exists a time
$\tau \ge 0$ such that $x(\tau\,|\,x_0) = x_T$. Consider now any $w \in C_c^1(\mathbb{R}^n)$, $w \ge 0$, and the associated
$v$ solving (37). Then we have

$$a\,v(x_T) - v(x_0) = \int (\nabla v\cdot f - \beta v)\,d\mu = \int w\,d\mu \ge 0.$$

Therefore, by Lemma 4,

$$a\,v(x_T) \ge v(x_0) = -\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt.$$

Using (38) again on $v(x_T)$ we get

$$-a\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_T))\,dt \ge -\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt,$$

or

$$a\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_T))\,dt \le \int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt. \tag{39}$$

Now pick $S > 0$ (to be specified later) and consider the traces

$$X_0 = \{x(t\,|\,x_0) \mid 0 \le t \le S\}, \qquad X_T = \{x(t\,|\,x_T) \mid 0 \le t \le S\}.$$

Assuming there is no $\tau \ge 0$ such that $x(\tau\,|\,x_0) = x_T$ we have $X_0 \cap X_T = \emptyset$ and since
$X_0$ and $X_T$ are compact there exists (by Urysohn's Lemma with mollification) a function
$w \in C_c^1(\mathbb{R}^n; [0,1])$ such that $w = 0$ on $X_0$ and $w = 1$ on $X_T$. Then the left hand side of (39)
is greater than or equal to $a(1 - e^{-\beta S})/\beta$ whereas the right hand side is less than or equal
to $e^{-\beta S}/\beta$. Since $a > 0$ and $\beta > 0$ we arrive at a contradiction with (39) by picking a
sufficiently large $S$. Therefore there exists a $\tau \ge 0$ such that $x(\tau\,|\,x_0) = x_T$ (i.e., $x_T$ and $x_0$
are on the same trace of the flow associated to $\dot x = f(x)$).

Now we prove that $a \le e^{-\beta\tau}$. Since $x_T = x(\tau\,|\,x_0)$ and $x_0$ are on the same trace we have

$$v(x_0) = e^{-\beta\tau}\,v(x_T) - \int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt.$$

Using again $a\,v(x_T) \ge v(x_0)$ if $w \ge 0$ we get

$$a\,v(x_T) \ge e^{-\beta\tau}\,v(x_T) - \int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt,$$

or

$$(e^{-\beta\tau} - a)\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_T))\,dt \ge -\int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt. \tag{40}$$

Consider the set

$$X_\tau = \{x(t\,|\,x_0) \mid 0 \le t \le \tau\}.$$

Since $x_0$ and $x_T$ are on the same trace (and $x_T$ follows $x_0$) there exists $w \in C^1$, $w \ge 0$,
such that $w = 0$ on $X_\tau$ and $w > 0$ elsewhere (e.g., let $w(x) = \min(\mathrm{dist}(x, X_\tau), 1)$ with
appropriate mollification). With this choice of $w$ the equation (40) gives

$$(e^{-\beta\tau} - a)\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_T))\,dt \ge 0$$

and therefore $a \le e^{-\beta\tau}$ since the integral is strictly positive. This proves the first two claims.

To finish we observe that any solution to (37) satisfies

$$e^{-\beta\tau}\,v(x_T) = v(x_0) + \int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt.$$

Therefore

$$a\,v(x_T) = v(x_0)\,a e^{\beta\tau} + a e^{\beta\tau}\int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt.$$

Using (38) we get

$$a\,v(x_T) = v(x_0) + a e^{\beta\tau}\int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt + (1 - a e^{\beta\tau})\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt,$$

where both weights $a e^{\beta\tau}$ and $1 - a e^{\beta\tau}$ are nonnegative. Since

$$a\,v(x_T) - v(x_0) = \int w\,d\mu$$

we conclude that

$$\int w\,d\mu = a e^{\beta\tau}\int_0^\tau e^{-\beta t}\,w(x(t\,|\,x_0))\,dt + (1 - a e^{\beta\tau})\int_0^\infty e^{-\beta t}\,w(x(t\,|\,x_0))\,dt,$$

i.e., $\mu$ is indeed generated by trajectories of $\dot x = f(x)$ (in this case by two trajectories, both
starting at $x_0$, one stopping at $\tau$, the other one continuing to infinity, with weights given by
the ratio of masses of $\mu_0$ and $\mu_T$). That is, the measure $\tau_{x_0}$ is given by

$$\tau_{x_0} = a e^{\beta\tau}\,\delta_\tau + (1 - a e^{\beta\tau})\,\delta_\infty$$

as expected.
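The two-trajectory decomposition can be verified numerically on a scalar example. The sketch below reuses illustrative choices that are not from the paper ($f(x) = -x$, $w(x) = (1 - x^2)^2$ on $|x| < 1$ and zero elsewhere, $\beta = 1$, trapezoid quadrature on a truncated horizon) and compares $a\,v(x_T) - v(x_0)$ with the weighted sum of the stopped and unstopped discounted integrals for an admissible mass $a \le e^{-\beta\tau}$.

```python
import math

BETA = 1.0  # discount factor (assumed value for the illustration)


def flow(t, x0):
    """Flow of the illustrative vector field f(x) = -x."""
    return x0 * math.exp(-t)


def w(x):
    """Compactly supported C^1 test function (illustrative choice)."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0


def discounted_integral(x0, upper, steps=20000):
    """Trapezoid rule for int_0^upper exp(-BETA t) w(x(t | x0)) dt."""
    h = upper / steps
    total = 0.0
    for k in range(steps + 1):
        t = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-BETA * t) * w(flow(t, x0))
    return h * total


def v(x0, horizon=40.0):
    """Formula (38), truncated at a finite horizon."""
    return -discounted_integral(x0, horizon)


def two_dirac_gap(x0, tau, a, horizon=40.0):
    """Difference between a*v(x_T) - v(x0) and the integral of w against the
    measure generated by tau_x0 = a e^{beta tau} delta_tau + (1 - a e^{beta tau}) delta_inf."""
    x_T = flow(tau, x0)
    lhs = a * v(x_T) - v(x0)
    weight = a * math.exp(BETA * tau)
    rhs = (weight * discounted_integral(x0, tau)
           + (1.0 - weight) * discounted_integral(x0, horizon))
    return lhs - rhs


# a = 0.3 is admissible for tau = 0.5 because exp(-0.5) is about 0.61.
print(two_dirac_gap(x0=0.8, tau=0.5, a=0.3))
```

The printed gap should be at the level of the quadrature error, consistent with the identity derived above.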


Dirac at $x_0$, sum of Diracs for $\mu_T$. Next we treat the case where $\mu_T = \sum_{i=1}^\infty a_i\delta_{x_i}$ for
some $a_i > 0$ and $x_i \in \mathbb{R}^n$. Using the same argument as in the previous case we can show
that

$$x_i \in X_0 = \{x(t\,|\,x_0) \mid t \ge 0\}$$

for all $i$ and that the condition

$$\sum_{i=1}^\infty a_i e^{\beta\tau_i} \le 1$$

holds with $\tau_i$ being the times to reach $x_i$ from $x_0$. Then we have

$$\tau_{x_0} = \sum_{i=1}^\infty a_i e^{\beta\tau_i}\,\delta_{\tau_i} + \Big(1 - \sum_{i=1}^\infty a_i e^{\beta\tau_i}\Big)\delta_\infty.$$

Dirac at $x_0$, arbitrary $\mu_T$. In the same way as before we can show that the support of
$\mu_T$ must be on the trace $X_0$. Then we can define the measure $\hat\mu_{x_0}$ by

$$\hat\mu_{x_0}(A) := \mu_T(x(A\,|\,x_0)), \qquad A \subset [0,\infty),$$

and show that it has to satisfy the condition $\int_0^\infty e^{\beta t}\,d\hat\mu_{x_0}(t) \le 1$. Next, using the fact that
the mapping $t \mapsto x(t\,|\,x_0)$ is invertible, we obtain

$$\int_{\mathbb{R}^n} v\,d\mu_T = \int_0^\infty v(x(t\,|\,x_0))\,d\hat\mu_{x_0}(t).$$

The conclusion of the theorem then holds with $\tau_{x_0}$ defined by

$$\tau_{x_0}(A) = \int_0^\infty I_A(t)\,e^{\beta t}\,d\hat\mu_{x_0}(t) + \Big[1 - \int_0^\infty e^{\beta t}\,d\hat\mu_{x_0}(t)\Big]I_A(\infty), \qquad A \subset [0,\infty].$$

Arbitrary $\mu_0$, arbitrary $\mu_T$. The general case follows by approximating $\mu_0$ by a sum of
Dirac measures, using the fact that any measure is the weak limit of a sequence of finite sums
of Dirac measures.